Very short utterances in conversation

Authors

Jens Edlund

Mattias Heldner

Samer Al Moubayed

Agustín Gravano

Julia Hirschberg

Abstract

Faced with the difficulties of finding an operationalized definition of backchannels, we have previously proposed an intermediate, auxiliary unit – the very short utterance (VSU) – which is defined operationally and is automatically extractable from recorded or ongoing dialogues. Here, we extend that work in the following ways: (1) we test the extent to which the VSU/NONVSU distinction corresponds to backchannels/non-backchannels in a different data set that is manually annotated for backchannels – the Columbia Games Corpus; (2) we examine the extent to which VSUs capture other short utterances with a vocabulary similar to backchannels; (3) we propose a VSU method for better managing turn-taking and barge-ins in spoken dialogue systems based on detection of backchannels; and (4) we attempt to detect backchannels with better precision by training a backchannel classifier using durations and inter-speaker relative loudness differences as features. The results show that VSUs indeed capture a large proportion of backchannels – large enough that VSUs can be used to improve spoken dialogue system turn-taking; and that building a reliable backchannel classifier working in real time is feasible.
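
As a rough illustration of the idea in point (4), the sketch below filters utterances by duration and trains a small classifier on two features: utterance duration and the loudness difference relative to the other speaker. It is not the authors' implementation. The duration threshold `VSU_MAX_DURATION_S`, the use of decibel values, the choice of logistic regression, and the toy training values are all assumptions made for illustration only; the abstract specifies none of them.

```python
import math
from dataclasses import dataclass

# Hypothetical threshold: the abstract does not state the VSU duration limit.
VSU_MAX_DURATION_S = 1.0


@dataclass
class Utterance:
    duration_s: float         # length of the speech segment in seconds
    loudness_db: float        # mean intensity of this utterance (assumed dB)
    other_loudness_db: float  # mean intensity of the other speaker (assumed dB)


def is_vsu(u: Utterance, max_duration_s: float = VSU_MAX_DURATION_S) -> bool:
    """Treat an utterance as a VSU if it is no longer than the threshold."""
    return u.duration_s <= max_duration_s


def features(u: Utterance) -> list:
    """Duration and inter-speaker relative loudness difference."""
    return [u.duration_s, u.loudness_db - u.other_loudness_db]


def sigmoid(z: float) -> float:
    # Numerically stable form that avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.05, epochs=2000):
    """Plain stochastic gradient descent for logistic regression."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict(w, b, x) -> bool:
    """Return True if the feature vector is classified as a backchannel."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5


# Synthetic toy examples (not corpus data): [duration_s, loudness difference].
X = [[0.3, -6.0], [0.4, -5.0], [0.5, -7.0],   # labelled as backchannels
     [0.7, 1.0], [0.9, 2.0], [0.8, 0.5]]      # labelled as other short utterances
y = [1, 1, 1, 0, 0, 0]

w, b = train_logistic(X, y)

candidate = Utterance(duration_s=0.4, loudness_db=55.0, other_loudness_db=61.0)
if is_vsu(candidate):
    print("backchannel" if predict(w, b, features(candidate)) else "other")
```

Both features are cheap to compute as soon as an utterance ends, which is consistent with the abstract's interest in real-time operation. A real system would need annotated data such as the corpus named above and a properly validated model.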

Similar resources

A Comparison between Three Methods of Language Sampling: Freeplay, Narrative Speech and Conversation

Objectives: Spontaneous language sample analysis is an important part of the language assessment protocol. Language samples give us useful information about how children use language in the natural situations of daily life. The purpose of this study was to compare conversation, freeplay, and narrative speech in terms of Mean Length of Utterance (MLU), Type-Token Ratio (TTR), and the numbe...

A Quantitative View of Short Utterances in Daily Conversation: A Case Study of That's right, That's true and That's correct

Short utterances serve a multitude of different communicative functions in interactive speech and have attracted due attention in recent research on dialogue acts. This paper presents a quantitative description of three short utterances, i.e. that's right, that's true, and that's correct, and their variations, based on the Switchboard Dialogue Act Corpus. Particularly, it offers an overview to account...

I-Vector/PLDA Variants for Text-Dependent Speaker Recognition

The i-vector/PLDA approach currently dominates the field of text-independent speaker recognition and the question of how to translate this methodology to the text-dependent domain has recently become an active area of research. The essential difference between the two fields is that it is possible to do speaker recognition with enrollment and test utterances of very short duration in the text-d...

Agreement and disagreement utterance detection in conversational speech by extracting and integrating local features

This paper presents a novel framework to automatically detect agreement and disagreement utterances in natural conversation. Such a capability is critical for conversation-understanding tasks such as meeting summarization. One of the difficulties of agreement and disagreement utterance detection in natural conversation is ambiguity in the utterance unit. Utterances are usually segmented by short pauses...

Casual Conversation Technology Achieving Natural Dialog with Computers

In recent years, voice recognition agents such as NTT DOCOMO’s “Shabette Concier” have become popular. Shabette Concier is a voice agent capable of responding to task-related utterances such as “send mail” or “call,” and can answer questions such as “how high is Mount Fuji?” or “what is the highest mountain in the world?” It can also respond to casual conversation with utterances such as “I love...

Comparing word, character, and phoneme n-grams for subjective utterance recognition

In this paper, we compare the performance of classifiers trained using word n-grams, character n-grams, and phoneme n-grams for recognizing subjective utterances in multiparty conversation. We show that there is value in using very shallow linguistic representations, such as character n-grams, for recognizing subjective utterances, in particular, gains in the recall of subjective utterances.

Publication date: 2010
